Over the past few years, concerns about the environmental impact of animal agriculture have grown. It is important to understand how our food choices affect the planet. By analyzing data on the environmental impacts of different types of food, we can gain valuable insights into how these products affect the environment.
These concerns are the driving force behind this analysis. In particular, animal agriculture is a significant contributor to greenhouse gas emissions. It is responsible for global methane and nitrous oxide emissions, which are powerful greenhouse gases. In addition, animal agriculture also uses large amounts of water and land.
To carry out this analysis, we will use Pandas for data manipulation and analysis, Seaborn for data visualization, Matplotlib for creating custom plots, SciPy for statistical tests and analysis, and SciKit-Learn for machine learning.
This tutorial will guide users through a typical data science pipeline, while also providing information about the issue of animal agriculture on the environment and why it is a pressing problem.
Initially, we must collect relevant data from the internet to conduct our analysis:
We will download each respective CSV file and then use Pandas to read and manipulate them accordingly to set up for data management.
import pandas as pd # Import Pandas library
CSV1 = 'Food_Production.csv' # Initialize CSV file containing environmental impact of food production data
FOOD_PRODUCTS = ["Bread", "Maize", "Barley",
"Oatmeal", "Rice", "Potato",
"Cassava", "Sugar", "Beet Sugar",
"Pulse", "Pea", "Nut",
"Groundnut", "Soymilk", "Tofu",
"Soybean Oil", "Palm Oil", "Sunflower Oil",
"Rapeseed Oil", "Olive Oil", "Tomato",
"Onion & Leek", "Root Veg.", "Brassica",
"Veg.", "Citrus Fruit", "Banana",
"Apple", "Berry & Grape", "Wine",
"Fruit", "Coffee", "Dark Chocolate",
"Beef", "Dairy Beef", "Lamb",
"Pig", "Poultry", "Milk",
"Cheese", "Egg", "Fish", "Shrimp"] # Create list of abbreviations for clearer graph labels (current ones are too long)
df1 = pd.read_csv(CSV1) # Read CSV file and store as a Pandas dataframe
df1.loc[:, "Food product"] = FOOD_PRODUCTS # Replace column in the dataframe called "Food product" and initialize with abbreviations
df1.head() # Print the first few entries of the dataframe
| Food product | Land use change | Animal Feed | Farm | Processing | Transport | Packging | Retail | Total_emissions | Eutrophying emissions per 1000kcal (gPO₄eq per 1000kcal) | ... | Freshwater withdrawals per 100g protein (liters per 100g protein) | Freshwater withdrawals per kilogram (liters per kilogram) | Greenhouse gas emissions per 1000kcal (kgCO₂eq per 1000kcal) | Greenhouse gas emissions per 100g protein (kgCO₂eq per 100g protein) | Land use per 1000kcal (m² per 1000kcal) | Land use per kilogram (m² per kilogram) | Land use per 100g protein (m² per 100g protein) | Scarcity-weighted water use per kilogram (liters per kilogram) | Scarcity-weighted water use per 100g protein (liters per 100g protein) | Scarcity-weighted water use per 1000kcal (liters per 1000 kilocalories) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bread | 0.1 | 0.0 | 0.8 | 0.2 | 0.1 | 0.1 | 0.1 | 1.4 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Maize | 0.3 | 0.0 | 0.5 | 0.1 | 0.1 | 0.1 | 0.0 | 1.1 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | Barley | 0.0 | 0.0 | 0.2 | 0.1 | 0.0 | 0.5 | 0.3 | 1.1 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | Oatmeal | 0.0 | 0.0 | 1.4 | 0.0 | 0.1 | 0.1 | 0.0 | 1.6 | 4.281357 | ... | 371.076923 | 482.4 | 0.945482 | 1.907692 | 2.897446 | 7.6 | 5.846154 | 18786.2 | 14450.92308 | 7162.104461 |
| 4 | Rice | 0.0 | 0.0 | 3.6 | 0.1 | 0.1 | 0.1 | 0.1 | 4.0 | 9.514379 | ... | 3166.760563 | 2248.4 | 1.207271 | 6.267606 | 0.759631 | 2.8 | 3.943662 | 49576.3 | 69825.77465 | 13449.891480 |
5 rows × 23 columns
Now that we have created our first dataframe containing the first dataset, let's create the next one.
CSV2 = 'meat_consumption_worldwide.csv' # Initialize CSV file containing meat consumption worldwide data
NON_COUNTRIES = ["WLD", "BRICS", "EU28", "OECD"] # Create list of non-countries to remove
df2 = pd.read_csv(CSV2) # Read CSV file and store as a Pandas dataframe
df2 = df2[~df2["LOCATION"].isin(NON_COUNTRIES)] # Remove WLD/BRICS/EU28/OECD from the dataframe, since we only want to look at individual countries
df2.head() # Print the first few entries of the dataframe
| LOCATION | SUBJECT | MEASURE | TIME | Value | |
|---|---|---|---|---|---|
| 0 | AUS | BEEF | KG_CAP | 1991 | 27.721815 |
| 1 | AUS | BEEF | KG_CAP | 1992 | 26.199591 |
| 2 | AUS | BEEF | KG_CAP | 1993 | 26.169094 |
| 3 | AUS | BEEF | KG_CAP | 1994 | 25.456134 |
| 4 | AUS | BEEF | KG_CAP | 1995 | 25.340226 |
We are done reading our data, so we can continue to the next stage where we will deal with the NaN values.
To ensure our data can be used coherently in a representation through graphs, we must manage it by creating new dataframes that can be easily graphed using Seaborn.
We will split df1 into two dataframes: df3 and df4. df3 will contain the first 9 columns, which are data calculated relative to each other. df4 will contain all the columns after, which are data calculated by a certain metric. This will be clearer when we graph the data.
END = 9 # Set the end range
df3 = df1.iloc[:, :END] # Keep the first 9 columns and store it as df1
df3.head() # Print the first few entries of the dataframe
| Food product | Land use change | Animal Feed | Farm | Processing | Transport | Packging | Retail | Total_emissions | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Bread | 0.1 | 0.0 | 0.8 | 0.2 | 0.1 | 0.1 | 0.1 | 1.4 |
| 1 | Maize | 0.3 | 0.0 | 0.5 | 0.1 | 0.1 | 0.1 | 0.0 | 1.1 |
| 2 | Barley | 0.0 | 0.0 | 0.2 | 0.1 | 0.0 | 0.5 | 0.3 | 1.1 |
| 3 | Oatmeal | 0.0 | 0.0 | 1.4 | 0.0 | 0.1 | 0.1 | 0.0 | 1.6 |
| 4 | Rice | 0.0 | 0.0 | 3.6 | 0.1 | 0.1 | 0.1 | 0.1 | 4.0 |
START = 9 # Set the start range
END = 23 # Set the end range
RANGE = [0] + list(range(START, END)) # Set the range of columns we want
df4 = df1.copy() # Create a copy to prevent warning
df4 = df4.iloc[:, RANGE] # Keep the columns after the 9th
df4.dropna(inplace = True) # Remove all NaNs since they don't provide any data
df4.reset_index(drop = True, inplace = True) # Reset our index for clarity
df4.head() # Print the first few entries of the dataframe
| Food product | Eutrophying emissions per 1000kcal (gPO₄eq per 1000kcal) | Eutrophying emissions per kilogram (gPO₄eq per kilogram) | Eutrophying emissions per 100g protein (gPO₄eq per 100 grams protein) | Freshwater withdrawals per 1000kcal (liters per 1000kcal) | Freshwater withdrawals per 100g protein (liters per 100g protein) | Freshwater withdrawals per kilogram (liters per kilogram) | Greenhouse gas emissions per 1000kcal (kgCO₂eq per 1000kcal) | Greenhouse gas emissions per 100g protein (kgCO₂eq per 100g protein) | Land use per 1000kcal (m² per 1000kcal) | Land use per kilogram (m² per kilogram) | Land use per 100g protein (m² per 100g protein) | Scarcity-weighted water use per kilogram (liters per kilogram) | Scarcity-weighted water use per 100g protein (liters per 100g protein) | Scarcity-weighted water use per 1000kcal (liters per 1000 kilocalories) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Oatmeal | 4.281357 | 11.23 | 8.638462 | 183.911552 | 371.076923 | 482.4 | 0.945482 | 1.907692 | 2.897446 | 7.60 | 5.846154 | 18786.2 | 14450.92308 | 7162.104461 |
| 1 | Rice | 9.514379 | 35.07 | 49.394366 | 609.983722 | 3166.760563 | 2248.4 | 1.207271 | 6.267606 | 0.759631 | 2.80 | 3.943662 | 49576.3 | 69825.77465 | 13449.891480 |
| 2 | Potato | 4.754098 | 3.48 | 20.470588 | 80.737705 | 347.647059 | 59.1 | 0.628415 | 2.705882 | 1.202186 | 0.88 | 5.176471 | 2754.2 | 16201.17647 | 3762.568306 |
| 3 | Nut | 3.113821 | 19.15 | 11.726883 | 672.162602 | 2531.414574 | 4133.8 | 0.069919 | 0.263319 | 2.107317 | 12.96 | 7.936314 | 229889.8 | 140777.58730 | 37380.455280 |
| 4 | Groundnut | 2.437931 | 14.14 | 5.401070 | 319.362069 | 707.524828 | 1852.3 | 0.556897 | 1.233766 | 1.570690 | 9.11 | 3.479756 | 61797.9 | 23605.00382 | 10654.810340 |
Next, we split create new dataframes from df3 and df4: df5 and df6. df5 will include the relative environmental impact values grouped by whether they're animal or plant foods. df6 will include the metric-based environmental impact values also grouped by whether they're animal or plant foods.
This is important to do so we can later compare the environmental impact of animals vs plant foods to graph and run a hypothesis test on whether the difference is significant.
PLANT_PRODUCTS = {"Bread", "Maize", "Barley",
"Oatmeal", "Rice", "Potato",
"Cassava", "Sugar", "Beet Sugar",
"Pulse", "Pea", "Nut",
"Groundnut", "Soymilk", "Tofu",
"Soybean Oil", "Palm Oil", "Sunflower Oil",
"Rapeseed Oil", "Olive Oil", "Tomato",
"Onion & Leek", "Root Veg.", "Brassica",
"Veg.", "Citrus Fruit", "Banana",
"Apple", "Berry & Grape", "Wine",
"Fruit", "Coffee", "Dark Chocolate"} # Declare a set of all plant products to categorize as "Plant"
ANIMAL_PRODUCTS = {"Beef", "Dairy Beef", "Lamb",
"Pig", "Poultry", "Milk",
"Cheese", "Egg", "Fish", "Shrimp"} # Declare a set of all animal products to categorize as "Animal"
df5 = df3.copy() # Duplicate df3 for manipulation and store it as df5
df6 = df4.copy() # Duplicate df4 for manipulation and store it as df6
df5["Food type"] = None # Create a new "Food type" column in df5 to categorize "Plant" or "Animal"
df6["Food type"] = None # Create a new "Food type" column in df6 to categorize "Plant" or "Animal"
# Iterate through df5 and then categorize individual rows as "Plant" or "Animal" depending on which set "Food product" is in
for i, r in df5.iterrows():
if r["Food product"] in PLANT_PRODUCTS:
df5.loc[i, "Food type"] = "Plant"
elif r["Food product"] in ANIMAL_PRODUCTS:
df5.loc[i, "Food type"] = "Animal"
# Iterate through df6 and then categorize individual rows as "Plant" or "Animal" depending on which set "Food product" is in
for i, r in df6.iterrows():
if r["Food product"] in PLANT_PRODUCTS:
df6.loc[i, "Food type"] = "Plant"
elif r["Food product"] in ANIMAL_PRODUCTS:
df6.loc[i, "Food type"] = "Animal"
df5 = df5.groupby("Food type").mean() # Group df5 by "Food type" and average the environmental impact values
df5["Food type"] = df5.index # Re-assign "Food type" to be a column after group by (so we can access it from its column name)
df5.reset_index(drop = True, inplace = True) # Reset our index for clarity
df5 # Print the first few entries of the dataframe
| Land use change | Animal Feed | Farm | Processing | Transport | Packging | Retail | Total_emissions | Food type | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2.810000 | 1.95 | 10.490000 | 0.500000 | 0.240000 | 0.220000 | 0.180000 | 16.390000 | Animal |
| 1 | 0.790909 | 0.00 | 1.342424 | 0.178788 | 0.181818 | 0.284848 | 0.036364 | 2.815152 | Plant |
Next, we will do the same for df6. The code is separate so we can see both outputs.
df6 = df6.groupby("Food type").mean() # Group df6 by "Food type" and average the environmental impact values
df6["Food type"] = df6.index # Re-assign "Food type" to be a column after group by (so we can access it from its column name)
df6.reset_index(drop = True, inplace = True) # Reset our index for clarity
df6 # Print the first few entries of the dataframe
| Eutrophying emissions per 1000kcal (gPO₄eq per 1000kcal) | Eutrophying emissions per kilogram (gPO₄eq per kilogram) | Eutrophying emissions per 100g protein (gPO₄eq per 100 grams protein) | Freshwater withdrawals per 1000kcal (liters per 1000kcal) | Freshwater withdrawals per 100g protein (liters per 100g protein) | Freshwater withdrawals per kilogram (liters per kilogram) | Greenhouse gas emissions per 1000kcal (kgCO₂eq per 1000kcal) | Greenhouse gas emissions per 100g protein (kgCO₂eq per 100g protein) | Land use per 1000kcal (m² per 1000kcal) | Land use per kilogram (m² per kilogram) | Land use per 100g protein (m² per 100g protein) | Scarcity-weighted water use per kilogram (liters per kilogram) | Scarcity-weighted water use per 100g protein (liters per 100g protein) | Scarcity-weighted water use per 1000kcal (liters per 1000 kilocalories) | Food type | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 58.085221 | 139.423333 | 73.289922 | 906.822192 | 1230.689663 | 2102.944444 | 10.436802 | 14.490724 | 34.723544 | 97.806667 | 51.590835 | 70855.533333 | 41650.170546 | 28049.055284 | Animal |
| 1 | 23.109872 | 20.742667 | 49.746990 | 422.777740 | 1728.610704 | 711.420000 | 5.741551 | 14.587478 | 5.421294 | 8.788667 | 19.385191 | 28073.860000 | 76078.624420 | 15746.276693 | Plant |
We decided to average the environmental impact as a way to properly compare the average imapct of animal vs plant foods on the environment. Getting the total wouldn't be as fair since there are more plant products than animal products listed.
Next, we will split df2 into three dataframes: df7, df8, and df9. df7 will contain 2022 data on relative meat consumption per country. df8 will contain data on how each country's meat consumption changes every year. df9 will contain data on how all countries' meat consumption changes every year.
df7 = df2.loc[df2["TIME"] == 2022] # Filter to contain only 2022 data and store it as df7
df7.reset_index(drop = True, inplace = True) # Reset our index for clarity
df7.head() # Print the first few entries of the dataframe
| LOCATION | SUBJECT | MEASURE | TIME | Value | |
|---|---|---|---|---|---|
| 0 | AUS | BEEF | KG_CAP | 2022 | 19.951395 |
| 1 | AUS | PIG | KG_CAP | 2022 | 20.795147 |
| 2 | AUS | POULTRY | KG_CAP | 2022 | 45.974985 |
| 3 | AUS | SHEEP | KG_CAP | 2022 | 8.606749 |
| 4 | CAN | BEEF | KG_CAP | 2022 | 18.426918 |
We should group the data by location since we want to observe the total meat consumption per country in our visualization.
df7 = df7.groupby("LOCATION").mean() # Group the data by location and average other values
df7["LOCATION"] = df7.index # Add "LOCATION" back as a column (after it's removed by groupby)
df7.reset_index(drop = True, inplace = True) # Reset the index and remove the "LOCATION" index
df7.drop(columns = ["TIME"], inplace = True) # Drop "TIME" since we're only looking at 2022
df7.head() # Print the first few entries of the dataframe
| Value | LOCATION | |
|---|---|---|
| 0 | 654.438777 | ARG |
| 1 | 395.792480 | AUS |
| 2 | 92.761958 | BGD |
| 3 | 2742.902830 | BRA |
| 4 | 421.909062 | CAN |
Since we aim to visualize this data, we need to come up with a solid range of values to separate relative meat consumption into categories so we can clearly see which countries consume more meat than others.
Let's sort this data and view it all.
df7.sort_values("Value") # Sort df7 (not inplace) and print its results
| Value | LOCATION | |
|---|---|---|
| 13 | 28.316855 | HTI |
| 22 | 40.967186 | MOZ |
| 12 | 48.032780 | GHA |
| 43 | 48.279836 | ZMB |
| 39 | 58.287311 | URY |
| 26 | 63.785447 | NZL |
| 30 | 72.225519 | PRY |
| 37 | 77.438105 | TZA |
| 11 | 84.938557 | ETH |
| 2 | 92.761958 | BGD |
| 25 | 97.187500 | NOR |
| 17 | 119.053703 | ISR |
| 9 | 119.396307 | DZA |
| 33 | 119.653849 | SDN |
| 19 | 138.128745 | KAZ |
| 5 | 149.544583 | CHE |
| 24 | 199.659620 | NGA |
| 6 | 224.346100 | CHL |
| 35 | 247.684181 | THA |
| 28 | 247.995524 | PER |
| 23 | 279.695257 | MYS |
| 32 | 286.645753 | SAU |
| 38 | 304.334280 | UKR |
| 10 | 331.591675 | EGY |
| 8 | 357.508500 | COL |
| 16 | 378.662937 | IRN |
| 36 | 395.481716 | TUR |
| 1 | 395.792480 | AUS |
| 4 | 421.909062 | CAN |
| 42 | 443.961610 | ZAF |
| 27 | 451.054780 | PAK |
| 20 | 473.225893 | KOR |
| 14 | 498.350871 | IDN |
| 29 | 524.246670 | PHL |
| 0 | 654.438777 | ARG |
| 15 | 709.425892 | IND |
| 18 | 720.685672 | JPN |
| 41 | 893.342928 | VNM |
| 21 | 1044.307995 | MEX |
| 31 | 1393.345149 | RUS |
| 34 | 1468.064750 | SSA |
| 3 | 2742.902830 | BRA |
| 40 | 5193.676467 | USA |
| 7 | 11695.816209 | CHN |
From this, we can intuitively suggest the following range values: 0, 100, 500, 1000, 5000, 10000, and 20000. This will be important when visualizing df7 later.
Next, we can create df8 and df9 accordingly.
df8 = df2.groupby(["LOCATION", "TIME"]).mean() # Group by df2 by location and time by mean
df8.reset_index(inplace = True) # Reset index for clarity (and to maintain intiail column values)
df8.head() # Print the first few entries of the dataframe
| LOCATION | TIME | Value | |
|---|---|---|---|
| 0 | ARG | 1990 | 524.747862 |
| 1 | ARG | 1991 | 393.885581 |
| 2 | ARG | 1992 | 420.682693 |
| 3 | ARG | 1993 | 447.727484 |
| 4 | ARG | 1994 | 436.872628 |
df9 = df2.groupby("TIME").mean() # Group by df2 by only time by mean
df9.reset_index(inplace = True) # Reset index for clarity (and to maintain intiail column values)
df9.head() # Print the first few entries of the dataframe
| TIME | Value | |
|---|---|---|
| 0 | 1990 | 346.211080 |
| 1 | 1991 | 366.204909 |
| 2 | 1992 | 384.500097 |
| 3 | 1993 | 401.016257 |
| 4 | 1994 | 424.126743 |
Again, we use mean because we want to see how meat consumption varies on average.
Now that we have managed and prepared all of our dataframes, we can go into the visual representation of this data. After doing this, we can analyze what the visualization means.
For our data representation, we will first create three visualizations: point plot based on relative food consumption (df3), bar plot based on metric-based food consumption (df4), and global map based on meat consumption per country (df7).
This will offer a good starting point of seeing how plant vs animal products compare and which countries are doing worse than others.
import seaborn as sns, matplotlib.pyplot as plt # Import Seaborn and Matplotlib for visualization
sns.set(rc = {"figure.figsize": (25, 25)}) # Set figure size to be large and clear
_, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows = 2, ncols = 2) # Create subplots to place four point plots on
p1 = sns.pointplot(x = "Food product", y = "Land use change", color = "red", data = df3, ax = ax1) # Create first (red) point plot with "Food product" against "Land use change"
p1.set(xlabel = "FOOD PRODUCT", ylabel = "LAND USE CHANGE") # Give appropriate names for the axis for clearer understanding
p1.set_xticklabels(p1.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p2 = sns.pointplot(x = "Food product", y = "Animal Feed", color = "green", data = df3, ax = ax2) # Create second (green) point plot with "Food product" against "Animal feed"
p2.set(xlabel = "FOOD PRODUCT", ylabel = "ANIMAL FEED") # Give appropriate names for the axis for clearer understanding
p2.set_xticklabels(p2.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p3 = sns.pointplot(x = "Food product", y = "Transport", color = "blue", data = df3, ax = ax3) # Create third (blue) point plot with "Food product" against "Transport"
p3.set(xlabel = "FOOD PRODUCT", ylabel = "TRANSPORT") # Give appropriate names for the axis for clearer understanding
p3.set_xticklabels(p3.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p4 = sns.pointplot(x = "Food product", y = "Total_emissions", color = "yellow", data = df3, ax = ax4) # Create fourth (yellow) point plot with "Food product" against "Total_emissions"
p4.set(xlabel = "FOOD PRODUCT", ylabel = "TOTAL EMISSIONS") # Give appropriate names for the axis for clearer understanding
p4.set_xticklabels(p4.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
ax1; ax2; ax3; ax4 # Print the four point plots
<matplotlib.axes._subplots.AxesSubplot at 0x7fb10e07f1f0>
The graph shows the relative plant versus animal food consumption in relation to four environmental factors: land use change, animal feed, transport, and total emissions.
_, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows = 2, ncols = 2) # Create subplots to place four bar plots on
p1 = sns.barplot(x = "Food product", y = "Eutrophying emissions per 1000kcal (gPO₄eq per 1000kcal)", data = df4, ax = ax1) # Create first bar plot with "Food product" against "Eutrophying emissions"
p1.set(xlabel = "FOOD PRODUCT", ylabel = "EUTROPHYING EMISSIONS (gPO₄eq per 1000kcal)") # Give appropriate names for the axis for clearer understanding
p1.set_xticklabels(p1.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p2 = sns.barplot(x = "Food product", y = "Freshwater withdrawals per 1000kcal (liters per 1000kcal)", data = df4, ax = ax2) # Create second bar plot with "Food product" against "Freshwater withdrawals"
p2.set(xlabel = "FOOD PRODUCT", ylabel = "WATER USE (liters per 1000kcal)") # Give appropriate names for the axis for clearer understanding
p2.set_xticklabels(p2.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p3 = sns.barplot(x = "Food product", y = "Greenhouse gas emissions per 1000kcal (kgCO₂eq per 1000kcal)", data = df4, ax = ax3) # Create third bar plot with "Food product" against "Greenhouse gas emissions"
p3.set(xlabel = "FOOD PRODUCT", ylabel = "GREENHOUSE GAS EMISSIONS (kgCO₂eq per 1000kcal)") # Give appropriate names for the axis for clearer understanding
p3.set_xticklabels(p3.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
p4 = sns.barplot(x = "Food product", y = "Land use per 1000kcal (m² per 1000kcal)", data = df4, ax = ax4) # Create fourth bar plot with "Food product" against "Land use"
p4.set(xlabel = "FOOD PRODUCT", ylabel = "LAND USE (m² per 1000kcal)") # Give appropriate names for the axis for clearer understanding
p4.set_xticklabels(p4.get_xticklabels(), rotation = 90) # Make x axis vertical so it can fit on graph
ax1; ax2; ax3; ax4 # Print the four bar plots
<matplotlib.axes._subplots.AxesSubplot at 0x7fb10dcdac40>
The four bar plots show data on eutrophying emissions, water use, greenhouse gas emissions, and land use. The data is presented as a metric per 1000kcal.
import folium # Import folium to create map
from branca.element import Figure # Import Figure from branca.element to fit map
fig = Figure(width = 1250, height = 750) # Create figure to fit the map on
world_map = folium.Map(location = [42.5, 0], zoom_start = 2) # Create map and focus on all relevant countries
geo_data = "https://raw.githubusercontent.com/datasets/geo-countries/master/data/countries.geojson" # Get geojson data to mark regions of countries
columns = ["LOCATION", "Value"] # Denote the columns that the "LOCATION" will be set on and the "Value" that's measured
threshold_scale = [0, 100, 500, 1000, 5000, 10000, 20000] # Provide our intuitive threshold scale here
folium.Choropleth(
geo_data = geo_data, # Provide geo data
data = df7, # Provide dataframe (df7)
columns = columns, # Provide columns
key_on = "properties.ISO_A3", # Provide metric to figure out location regions on
fill_color = "YlOrRd", # Provide color scheme
legend_name = "Relative Meat Consumption", # Provide a clear legend name
threshold_scale = threshold_scale # Provide threshold scale
).add_to(world_map) # Add a created world map to the created map
fig.add_child(world_map) # Add map onto the figure
fig # Print the figure